{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "Fairness Exercise 1: Explore the Model",
      "provenance": [],
      "collapsed_sections": [
        "J3R2QWkru1WN",
        "UFBqqnRD-Zkj",
        "6KmrCS-uAz0s",
        "IvvxNMgM-6A2",
        "-J4hbOhgHZid",
        "tGyACRd8oFwP",
        "FQGWSdrJy08B",
        "LlkfgynX0yfF",
        "FBhBsevUOinO",
        "P5MBQR7EF6ny",
        "OaL3qgHCcmwG"
      ]
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "HnU0fNSuG2aD"
      },
      "source": [
        "# Fairness Exercise 1: Explore the Model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "iVkPBosnIFlu"
      },
      "source": [
        "**Learning Objectives:**\n",
        "* Train a classifier to predict the toxicity of text comments.\n",
        "* Explore the Civil Comments dataset and examine the toxic-text classifier's predictions using the What-If Tool.\n",
        "* Install and use Fairness Indicators to evaluate the toxic-text classifier's results.\n",
        "* Identify the source of bias in the classifier's predictions.\n",
        "\n",
        "**Overview**\n",
        "\n",
        "In this exercise, you'll use Fairness Indicators to evaluate a toxicity classifier trained exclusively on the text comments in the Civil Comments dataset.\n",
        "\n",
        "**About Fairness Indicators**\n",
        "\n",
        "Fairness Indicators is a suite of tools, built on top of TensorFlow Model Analysis, that enables regular evaluation of fairness metrics in product pipelines.\n",
        "\n",
        "Fairness Indicators makes it easy for you to ask questions about how your model performs for different groups of users. The suite includes [TensorFlow Data Validation](https://www.tensorflow.org/tfx/data_validation/get_started), [TensorFlow Model Analysis](https://www.tensorflow.org/tfx/tutorials/model_analysis/tfma_basic), and the [What-If tool](https://pair-code.github.io/what-if-tool/).\n",
        "\n",
        "These tools help you compute common classification fairness metrics and evaluate model performance for defined groups of users, and visualize comparisons to a baseline slice. You can evaluate pipelines of all sizes, and compare your results using different thresholds and confidence levels. Fairness Indicators allows you to deep-dive into individual slices and interrogate your dataset, adjusting confidence intervals and evaluations at multiple thresholds. \n",
        "\n",
        "Fairness Indicators is packaged with [TensorFlow Data Validation](https://www.tensorflow.org/tfx/data_validation/get_started) and [What-If Tool](https://pair-code.github.io/what-if-tool/) to allow users to evaluate the distribution of datasets and probe models down to the individual datapoint with the What-If Tool.\n",
        "\n",
        "For a closer look at the Fairness Indicators suite, check out this\n",
        "[link](https://github.com/tensorflow/fairness-indicators). To get started with Fairness Indicators, keep reading."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "g0PSuIW-I8iP"
      },
      "source": [
        "## Setup\n",
        "\n",
        "In this section, you'll install and import the dependencies for the libraries we'll be using in this exercise.\n",
        "\n",
        "First, run the cell below to install Fairness Indicators.\n",
        "\n",
        "**NOTE:** You **MUST RESTART** the Colab runtime after doing this installation, either by clicking the **RESTART RUNTIME** button at the bottom of this cell or by selecting **Runtime->Restart runtime...** from the menu bar above."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "2z_xzJ40j9Q-",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "!pip install fairness-indicators \\\n",
        "  \"absl-py==0.8.0\" \\\n",
        "  \"pyarrow==0.15.1\" \\\n",
        "  \"apache-beam==2.17.0\" \\\n",
        "  \"avro-python3==1.9.1\" \\\n",
        "  \"tfx-bsl==0.21.4\" \\\n",
        "  \"tensorflow-data-validation==0.21.5\""
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mdLlKWbIlxYH",
        "colab_type": "text"
      },
      "source": [
        "Next, import all the dependencies we'll use in this exercise, which include Fairness Indicators, TensorFlow Model Analysis (tfma), TensorFlow Data Validation (tfdv), and the What-If tool (WIT):"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "6E__x2XkJDFW",
        "colab": {}
      },
      "source": [
        "%tensorflow_version 2.x\n",
        "import os\n",
        "import tempfile\n",
        "import apache_beam as beam\n",
        "import numpy as np\n",
        "import pandas as pd\n",
        "from datetime import datetime\n",
        "\n",
        "import tensorflow_hub as hub\n",
        "import tensorflow as tf\n",
        "import tensorflow_model_analysis as tfma\n",
        "import tensorflow_data_validation as tfdv\n",
        "from tensorflow_model_analysis.addons.fairness.post_export_metrics import fairness_indicators\n",
        "from tensorflow_model_analysis.addons.fairness.view import widget_view\n",
        "from fairness_indicators.documentation.examples import util\n",
        "\n",
        "from witwidget.notebook.visualization import WitConfigBuilder\n",
        "from witwidget.notebook.visualization import WitWidget"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "dNfRkjWWIKIs"
      },
      "source": [
        "## Part I: Audit the Data\n",
        "In this section, you'll audit the Civil Comments dataset to proactively identify fairness considerations prior to training the model.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "J3R2QWkru1WN",
        "colab_type": "text"
      },
      "source": [
        "### About the Civil Comments dataset\n",
        "\n",
        "Click below to learn more about the Civil Comments dataset, and how we've preprocessed it for this exercise."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ZZswcJJMCDjU",
        "colab_type": "text"
      },
      "source": [
        "The Civil Comments dataset comprises approximately 2 million public comments that were submitted to the [Civil Comments platform](https://medium.com/@aja_15265/saying-goodbye-to-civil-comments-41859d3a2b1d). [Jigsaw](https://jigsaw.google.com/) sponsored the effort to compile and annotate these comments for ongoing [research](https://arxiv.org/abs/1903.04561); they've also hosted competitions on [Kaggle](https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification) to help classify toxic comments as well as minimize unintended model bias. \n",
        "\n",
        "#### Features\n",
        "\n",
        "Within the Civil Comments data, a subset of comments are tagged with a variety of identity attributes pertaining to gender, sexual orientation, religion, race, and ethnicity. Each identity annotation column contains a value that represents the percentage of annotators who categorized a comment as containing references to that identity. Multiple identities may be present in a comment.\n",
        "\n",
        "**NOTE:** These identity attributes are intended *for evaluation purposes only*, to assess how well a classifier trained solely on the comment text performs on different tag sets.\n",
        "\n",
        "To collect these identity labels, each comment was reviewed by up to 10 annotators, who were asked to indicate all identities mentioned in the comment. For example, annotators were posed the question \"What genders are mentioned in the comment?\" and asked to select all applicable categories:\n",
        "\n",
        "* Male\n",
        "* Female\n",
        "* Transgender\n",
        "* Other gender\n",
        "* No gender mentioned\n",
        "\n",
        "**NOTE:** *We recognize the limitations of the categories used in the original dataset, and acknowledge that these terms do not encompass the full range of vocabulary used in describing gender.*\n",
        "\n",
        "Jigsaw used these ratings to generate an aggregate score for each identity attribute representing the percentage of raters who said the identity was mentioned in the comment. For example, if 10 annotators reviewed a comment, and 6 said that the comment mentioned the identity \"female\" and 0 said that the comment mentioned the identity \"male,\" the comment would receive a `female` score of `0.6` and a `male` score of `0.0`.\n",
        "\n",
        "**NOTE:** For the purposes of annotation, a comment was considered to \"mention\" gender if it contained a comment about gender issues (e.g., a discussion about feminism, wage gap between men and women, transgender rights, etc.), gendered language, or gendered insults. Use of \"he,\" \"she,\" or gendered names (e.g., Donald, Margaret) did not require a gender label. \n",
        "\n",
        "#### Label\n",
        "\n",
        "Each comment was rated for toxicity by up to 10 annotators, each of whom classified it with one of the following ratings.\n",
        "\n",
        "* Very Toxic\n",
        "* Toxic\n",
        "* Hard to Say\n",
        "* Not Toxic\n",
        "\n",
        "Again, Jigsaw used these ratings to generate an aggregate toxicity \"score\" for each comment (ranging from `0.0` to `1.0`) to serve as the [label](https://developers.google.com/machine-learning/glossary?utm_source=Colab&utm_medium=fi-colab&utm_campaign=fi-practicum&utm_content=glossary&utm_term=label#label), representing the fraction of annotators who labeled the comment either \"Very Toxic\" or \"Toxic.\" For example, if 10 annotators rated a comment, and 3 of them labeled it \"Very Toxic\" and 5 of them labeled it \"Toxic\", the comment would receive a toxicity score of `0.8`.\n",
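        "\n",
        "As a quick check of the arithmetic above (a sketch for illustration, not Jigsaw's actual aggregation code):\n",
        "\n",
        "```python\n",
        "ratings = ['Very Toxic'] * 3 + ['Toxic'] * 5 + ['Not Toxic'] * 2  # 10 annotators\n",
        "# The toxicity score is the fraction of annotators who chose \"Very Toxic\" or \"Toxic\".\n",
        "toxicity = sum(r in ('Very Toxic', 'Toxic') for r in ratings) / len(ratings)\n",
        "toxicity  # 0.8\n",
        "```\n",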
        "\n",
        "**NOTE:** For more information on the Civil Comments labeling schema, see the [Data](https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/data) section of the Jigsaw Unintended Bias in Toxicity Classification Kaggle competition.\n",
        "\n",
        "#### Example\n",
        "\n",
        "Here are the feature values for one example in the dataset:\n",
        "\n",
        "* **`comment_text`**: `i'm a white woman in my late 60's and believe me, they are not too crazy about me either!!`\n",
        "* **`female`**: `1.0`\n",
        "* **`white`**: `1.0`\n",
        "\n",
        "All raters tagged this comment with the labels `female` and `white`, giving the example scores of `1.0` for each of these identity mention labels.\n",
        "\n",
        "**NOTE:** All other identity labels (e.g., `male`, `asian`) had values of `0.0`.\n",
        "\n",
        "Here's the label for this example:\n",
        "\n",
        "* **`toxicity`**: `0.0`\n",
        "\n",
        "All raters labeled the above comment \"not toxic,\" which resulted in a toxicity label of `0.0`.\n",
        "\n",
        "### Preprocessing the data\n",
        "For the purposes of this exercise, we converted toxicity and identity columns to booleans in order to work with our neural net and metrics calculations. In the preprocessed dataset, we considered any value ≥ 0.5 as True (i.e., a comment is considered toxic if 50% or more crowd raters labeled it as toxic).\n",
        "\n",
        "Identity labels were likewise thresholded at 0.5, and individual identities were grouped together by category. For example, a comment with `{ male: 0.3, female: 1.0, transgender: 0.0, heterosexual: 0.8, homosexual_gay_or_lesbian: 1.0 }` becomes `{ gender: [female], sexual_orientation: [heterosexual, homosexual_gay_or_lesbian] }` after processing.\n",
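        "\n",
        "The thresholding and grouping just described can be sketched as follows. This is a minimal illustration, not the actual preprocessing code; the `GROUPS` mapping below is hypothetical and covers only two of the five identity categories:\n",
        "\n",
        "```python\n",
        "GROUPS = {'gender': ['male', 'female', 'transgender'],\n",
        "          'sexual_orientation': ['heterosexual', 'homosexual_gay_or_lesbian']}\n",
        "\n",
        "def group_identities(scores, threshold=0.5):\n",
        "  # Keep each identity whose annotator score meets the threshold;\n",
        "  # missing identity fields count as 0.0 (i.e., False).\n",
        "  return {category: [identity for identity in identities\n",
        "                     if scores.get(identity, 0.0) >= threshold]\n",
        "          for category, identities in GROUPS.items()}\n",
        "\n",
        "group_identities({'male': 0.3, 'female': 1.0, 'transgender': 0.0,\n",
        "                  'heterosexual': 0.8, 'homosexual_gay_or_lesbian': 1.0})\n",
        "# {'gender': ['female'],\n",
        "#  'sexual_orientation': ['heterosexual', 'homosexual_gay_or_lesbian']}\n",
        "```\n",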
        "\n",
        "**NOTE:** Missing identity fields were converted to False.\n",
        "\n",
        "#### Example\n",
        "\n",
        "After preprocessing, here's the revised feature and label data for the example from above:\n",
        "\n",
        "* **`comment_text`**: `i'm a white woman in my late 60's and believe me, they are not too crazy about me either!!`\n",
        "* **`gender`**: `[female]`\n",
        "* **`race`**: `[white]`\n",
        "* **`disability`**: `[]`\n",
        "* **`religion`**: `[]`\n",
        "* **`sexual_orientation`**: `[]`\n",
        "* **`toxicity`**: `0.0`\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "0YNqAJW5JjZD"
      },
      "source": [
        "### Load the data\n",
        "\n",
        "We've posted copies of both the original Civil Comments dataset and our preprocessed data on Google Cloud Platform (in [TFRecord](https://www.tensorflow.org/tutorials/load_data/tfrecord) format) to make it easy to import into this notebook.\n",
        "\n",
        "Run the following cell to download and import the training and validation datasets. By default, the following code will load the preprocessed data. If you prefer, you can enable the `download_original_data` checkbox at right to download the original dataset and preprocess it as described in the previous section (this may take 5-10 minutes).\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "duPWGTQAvYKK",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "download_original_data = False #@param {type:\"boolean\"}\n",
        "\n",
        "if download_original_data:\n",
        "  train_tf_file = tf.keras.utils.get_file('train_tf.tfrecord',\n",
        "                                          'https://storage.googleapis.com/civil_comments_dataset/train_tf.tfrecord')\n",
        "  validate_tf_file = tf.keras.utils.get_file('validate_tf.tfrecord',\n",
        "                                             'https://storage.googleapis.com/civil_comments_dataset/validate_tf.tfrecord')\n",
        "\n",
        "  # The identity terms list will be grouped together by their categories\n",
        "  # (see 'IDENTITY_COLUMNS') at threshold 0.5. Only the identity term column,\n",
        "  # text column and label column will be kept after processing.\n",
        "  train_tf_file = util.convert_comments_data(train_tf_file)\n",
        "  validate_tf_file = util.convert_comments_data(validate_tf_file)\n",
        "\n",
        "else:\n",
        "  train_tf_file = tf.keras.utils.get_file('train_tf_processed.tfrecord',\n",
        "                                          'https://storage.googleapis.com/civil_comments_dataset/train_tf_processed.tfrecord')\n",
        "  validate_tf_file = tf.keras.utils.get_file('validate_tf_processed.tfrecord',\n",
        "                                             'https://storage.googleapis.com/civil_comments_dataset/validate_tf_processed.tfrecord')"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "aLup7wY0_Q3K",
        "colab_type": "text"
      },
      "source": [
        "### Explore the data distribution in TFDV\n",
        "\n",
        "Before we train the model, let's do a quick audit of our training data using [TensorFlow Data Validation](https://www.tensorflow.org/tfx/data_validation/get_started), so we can better understand our data distribution.\n",
        "\n",
        "**NOTE:** *The following cell may take 2–3 minutes to run.*"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "vkzcE_g8_m_h",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "stats = tfdv.generate_statistics_from_tfrecord(data_location=train_tf_file)\n",
        "tfdv.visualize_statistics(stats)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wZU1Djze6E-s",
        "colab_type": "text"
      },
      "source": [
        "### Exercise\n",
        "Use the TensorFlow Data Validation widget above to answer the following questions."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ne2_vKAb-XGD",
        "colab_type": "text"
      },
      "source": [
        "#### **1. How many total examples are in the training dataset?**"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "UFBqqnRD-Zkj",
        "colab_type": "text"
      },
      "source": [
        "#### Solution\n",
        "\n",
        "Click below for the solution.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "XSkOfchI-arC",
        "colab_type": "text"
      },
      "source": [
        "**There are 1.08 million total examples in the training dataset.**\n",
        "\n",
        "  The count column tells us how many examples there are for a given feature.  Each feature (`sexual_orientation`, `comment_text`, `gender`, etc.) has 1.08 million examples. The missing column tells us what percentage of examples are missing that feature. \n",
        "\n",
        "![Screenshot of first row of Categorical Features table in the TFDV widget, with 1.08 million count of examples and 0% missing examples highlighted](https://developers.google.com/machine-learning/practica/fairness-indicators/colab-images/tfdv_screenshot_exercise1.png)  \n",
        "  \n",
        "Each feature is missing from 0% of examples, so we know that the per-feature example count of 1.08 million is also the total number of examples in the dataset."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_PgFNm6sAZB2",
        "colab_type": "text"
      },
      "source": [
        "#### **2. How many unique values are there for the `gender` feature? What are they, and what are the frequencies of each of these values?**\n",
        "\n",
        "**NOTE #1:** `gender` and the other identity features (`sexual_orientation`, `religion`, `disability`, and `race`) are included in this dataset for evaluation purposes only, so we can assess model performance on different identity slices. The only feature we will use for model training is `comment_text`.\n",
        "\n",
        "**NOTE #2:** *We recognize the limitations of the categories used in the original dataset, and acknowledge that these terms do not encompass the full range of vocabulary used in describing gender.*"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6KmrCS-uAz0s",
        "colab_type": "text"
      },
      "source": [
        "#### Solution\n",
        "\n",
        "Click below for the solution."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wkc7P1nvA4cw",
        "colab_type": "text"
      },
      "source": [
        "The **unique** column of the **Categorical Features** table tells us that there are 4 unique values for the `gender` feature.\n",
        "\n",
        "To view the 4 values and their frequencies, we can click on the **SHOW RAW DATA** button:\n",
        "\n",
        "![\"gender\" row of the \"Categorical Data\" table in the TFDV widget, with raw data highlighted.](https://developers.google.com/machine-learning/practica/fairness-indicators/colab-images/tfdv_screenshot_exercise2.png)\n",
        "\n",
        "The raw data table shows that there are 32,208 examples with a gender value of `female`, 26,758 examples with a value of `male`, 1,551 examples with a value of `transgender`, and 4 examples with a value of `other gender`.\n",
        "\n",
        "\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "NDUO57bdNUQR",
        "colab_type": "text"
      },
      "source": [
        "**NOTE:** As described [earlier](#scrollTo=J3R2QWkru1WN), a `gender` feature can contain zero or more of these 4 values, depending on the content of the comment. For example, a comment containing the text \"I am a transgender man\" will have both `transgender` and `male` as `gender` values, whereas a comment that does not reference gender at all will have an empty/false `gender` value."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wX62Ktwp-qoF",
        "colab_type": "text"
      },
      "source": [
        "#### **3. What percentage of total examples are labeled toxic? Overall, is this a class-balanced dataset (relatively even split of examples between positive and negative classes) or a class-imbalanced dataset (majority of examples are in one class)?**\n",
        "\n",
        "**NOTE:** In this dataset, a `toxicity` value of `0` signifies \"not toxic,\"  and a `toxicity` value of `1` signifies \"toxic.\""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "IvvxNMgM-6A2",
        "colab_type": "text"
      },
      "source": [
        "#### Solution\n",
        "\n",
        "Click below for the solution."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "QmCtkzZqOvC2",
        "colab_type": "text"
      },
      "source": [
        "**7.98 percent of examples are toxic.**\n",
        "\n",
        "Under **Numeric Features**, we can see the distribution of values for the `toxicity` feature. 92.02% of examples have a value of 0 (which signifies \"non-toxic\"), so 7.98% of examples are toxic.\n",
        "\n",
        "![Screenshot of the \"toxicity\" row in the Numeric Features table in the TFDV widget, highlighting the \"zeros\" column showing that 92.01% of examples have a toxicity value of 0.](https://developers.google.com/machine-learning/practica/fairness-indicators/colab-images/tfdv_screenshot_exercise3.png)\n",
        "\n",
        "This is a [**class-imbalanced dataset**](https://developers.google.com/machine-learning/glossary?utm_source=Colab&utm_medium=fi-colab&utm_campaign=fi-practicum&utm_content=glossary&utm_term=class-imbalanced-dataset#class-imbalanced-dataset), as the overwhelming majority of examples (over 90%) are classified as nontoxic."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9MGLCsVhGWz0",
        "colab_type": "text"
      },
      "source": [
        "#### **4. Run the following code to analyze label distribution for the subset of examples that contain a `gender` value**"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "f5pEWIkgLTKz",
        "colab_type": "code",
        "cellView": "form",
        "colab": {}
      },
      "source": [
        "#@title Calculate label distribution for gender-related examples\n",
        "raw_dataset = tf.data.TFRecordDataset(train_tf_file)\n",
        "\n",
        "toxic_gender_examples = 0\n",
        "nontoxic_gender_examples = 0\n",
        "\n",
        "# There are 1,082,924 examples in the dataset\n",
        "for raw_record in raw_dataset.take(1082924):\n",
        "  example = tf.train.Example()\n",
        "  example.ParseFromString(raw_record.numpy())\n",
        "  if str(example.features.feature[\"gender\"].bytes_list.value) != \"[]\":\n",
        "    if str(example.features.feature[\"toxicity\"].float_list.value) == \"[1.0]\":\n",
        "      toxic_gender_examples += 1\n",
        "    else:\n",
        "      nontoxic_gender_examples += 1\n",
        "\n",
        "print(\"Toxic Gender Examples: %s\" % toxic_gender_examples)\n",
        "print(\"Nontoxic Gender Examples: %s\" % nontoxic_gender_examples)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "WJag4cEKNINy",
        "colab_type": "text"
      },
      "source": [
        "#### **What percentage of `gender` examples are labeled toxic? Compare this percentage to the percentage of total examples that are labeled toxic from #3 above. What, if any, fairness concerns can you identify based on this comparison?**"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-J4hbOhgHZid",
        "colab_type": "text"
      },
      "source": [
        "#### Solution\n",
        "\n",
        "Click below for one possible solution."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2KK3VWzkHmJ7",
        "colab_type": "text"
      },
      "source": [
        "There are 7,189 gender-related examples that are labeled toxic, which represent 14.7% of all gender-related examples.\n",
        "\n",
        "The percentage of gender-related examples that are toxic (14.7%) is nearly double the percentage of toxic examples overall (7.98%). In other words, gender-related comments in our dataset are almost twice as likely as comments overall to be labeled toxic.\n",
        "\n",
        "This skew suggests that a model trained on this dataset might learn a correlation between gender-related content and toxicity. This raises fairness considerations, as the model might be more likely to classify nontoxic comments as toxic if they contain gender terminology, which could lead to [disparate impact](https://developers.google.com/machine-learning/glossary?utm_source=Colab&utm_medium=fi-colab&utm_campaign=fi-practicum&utm_content=glossary&utm_term=disparate-impact#disparate-impact) for gender subgroups. "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5WHVYnlGX7g8",
        "colab_type": "text"
      },
      "source": [
        "## Part II: Train the model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Xi5nnJ3hX9OO",
        "colab_type": "text"
      },
      "source": [
        "In this section, you'll train a classifier on the Civil Comments dataset to predict whether a given text comment is toxic or not."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ZCYUx0uVAz7v",
        "colab_type": "text"
      },
      "source": [
        "### Configure model input\n",
        "\n",
        "In order to feed data into our model, we'll need to define both a feature map and an input function. \n",
        "\n",
        "The feature map configures the features and label we'll be using, and their corresponding data types. The only feature we'll use for training is the `comment_text`. However, we'll also use the features for the five different identity categories (`sexual_orientation`, `gender`, `religion`, `race`, and `disability`) for evaluation purposes. Our label is `toxicity`, which (after our data preprocessing) has a value of either `0` (\"not toxic\") or `1` (\"toxic\")."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "SXqU-HJlB0fG",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "TEXT_FEATURE = 'comment_text'\n",
        "LABEL = 'toxicity'\n",
        "\n",
        "FEATURE_MAP = {\n",
        "    # Label:\n",
        "    LABEL: tf.io.FixedLenFeature([], tf.float32),\n",
        "    # Text:\n",
        "    TEXT_FEATURE:  tf.io.FixedLenFeature([], tf.string),\n",
        "\n",
        "    # Identities:\n",
        "    'sexual_orientation':tf.io.VarLenFeature(tf.string),\n",
        "    'gender':tf.io.VarLenFeature(tf.string),\n",
        "    'religion':tf.io.VarLenFeature(tf.string),\n",
        "    'race':tf.io.VarLenFeature(tf.string),\n",
        "    'disability':tf.io.VarLenFeature(tf.string),\n",
        "}"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "OlEoDYlvMABU",
        "colab_type": "text"
      },
      "source": [
        "The input function below specifies how to parse the training data and batch it for input to the model.\n",
        "\n",
        "Because we uncovered a class imbalance when auditing our dataset with TFDV earlier, we'll preprocess our data to add a `weight` column to each example.\n",
        "We'll set the `weight` value for each example to `LABEL + 0.1`, resulting in a `weight` of 0.1 for nontoxic examples and a `weight` of 1.1 for toxic examples. During model training, TensorFlow will multiply each example's loss by its `weight`, *upweighting* the toxic examples by increasing the penalty for error in scoring a toxic example relative to the penalty for error in scoring a nontoxic example.\n",
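        "\n",
        "A quick check of this weighting scheme (a sketch for illustration, not part of the training code):\n",
        "\n",
        "```python\n",
        "toxic_weight = 1.0 + 0.1     # toxic examples (label 1) get weight 1.1\n",
        "nontoxic_weight = 0.0 + 0.1  # nontoxic examples (label 0) get weight 0.1\n",
        "toxic_weight / nontoxic_weight  # an error on a toxic example costs ~11x more\n",
        "```\n",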
        "\n",
        "Then we'll feed our data into the model in batches of 512 examples.\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "ZaaKKjYwP6XY",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "def train_input_fn():\n",
        "  def parse_function(serialized):\n",
        "    parsed_example = tf.io.parse_single_example(\n",
        "        serialized=serialized, features=FEATURE_MAP)\n",
        "    # Adds a weight column to deal with unbalanced classes.\n",
        "    parsed_example['weight'] = tf.add(parsed_example[LABEL], 0.1)\n",
        "    return (parsed_example,\n",
        "            parsed_example[LABEL])\n",
        "  train_dataset = tf.data.TFRecordDataset(\n",
        "      filenames=[train_tf_file]).map(parse_function).batch(512)\n",
        "  return train_dataset"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "EX92OLF2MKSu"
      },
      "source": [
        "### Train the model\n",
        "\n",
        "Next, create a deep neural network model and train it on the data. Run the code below to create a `DNNClassifier` model with two hidden layers.\n",
        "\n",
        "**NOTE:** For training, the only feature we will feed into the model is an embedding of our comment text (`embedded_text_feature_column`). The identity features we configured [above](#scrollTo=ZCYUx0uVAz7v) will only be used to assess model performance later on in the evaluation phase."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "2qW8TlgVRzFc",
        "colab": {}
      },
      "source": [
        "BASE_DIR = tempfile.gettempdir()\n",
        "\n",
        "model_dir = os.path.join(BASE_DIR, 'train', datetime.now().strftime(\n",
        "    \"%Y%m%d-%H%M%S\"))\n",
        "\n",
        "embedded_text_feature_column = hub.text_embedding_column(\n",
        "    key=TEXT_FEATURE,\n",
        "    module_spec='https://tfhub.dev/google/nnlm-en-dim128/1')\n",
        "\n",
        "classifier = tf.estimator.DNNClassifier(\n",
        "    hidden_units=[500, 100],\n",
        "    weight_column='weight',\n",
        "    feature_columns=[embedded_text_feature_column],\n",
        "    optimizer=tf.optimizers.Adagrad(learning_rate=0.003),\n",
        "    loss_reduction=tf.losses.Reduction.SUM,\n",
        "    n_classes=2,\n",
        "    model_dir=model_dir)\n",
        "\n",
        "classifier.train(input_fn=train_input_fn, steps=1000)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "EjNbZGqFYCUy",
        "colab_type": "text"
      },
      "source": [
        "## Part III: Run Fairness Indicators\n",
        "\n",
        "In this section you'll use Fairness Indicators to evaluate the model's results for different subgroups of comments. Specifically, you'll take a closer look at performance for different gender categories."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "EXd2-UzFbAwE",
        "colab_type": "text"
      },
      "source": [
        "### Export the model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "jMYWdUaUbIb-",
        "colab_type": "text"
      },
      "source": [
        "First, let's export the model we trained in the [previous section](#scrollTo=5WHVYnlGX7g8), so that we can analyze the results using [TensorFlow Model Analysis (TFMA)](https://www.tensorflow.org/tfx/model_analysis/get_started)."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "R-7dDws2bHxS",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "def eval_input_receiver_fn():\n",
        "  serialized_tf_example = tf.compat.v1.placeholder(\n",
        "      dtype=tf.string, shape=[None], name='input_example_placeholder')\n",
        "\n",
        "  # This *must* be a dictionary containing a single key 'examples', which\n",
        "  # points to the input placeholder.\n",
        "  receiver_tensors = {'examples': serialized_tf_example}\n",
        "\n",
        "  features = tf.io.parse_example(serialized_tf_example, FEATURE_MAP)\n",
        "  features['weight'] = tf.ones_like(features[LABEL])\n",
        "\n",
        "  return tfma.export.EvalInputReceiver(\n",
        "    features=features,\n",
        "    receiver_tensors=receiver_tensors,\n",
        "    labels=features[LABEL])\n",
        "\n",
        "tfma_export_dir = tfma.export.export_eval_savedmodel(\n",
        "  estimator=classifier,\n",
        "  export_dir_base=os.path.join(BASE_DIR, 'tfma_eval_model'),\n",
        "  eval_input_receiver_fn=eval_input_receiver_fn)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Xq8NWjzUcAhA",
        "colab_type": "text"
      },
      "source": [
        "### Compute fairness metrics\n",
        "\n",
        "Next, run the following code to compute fairness metrics on the model output. Here, we'll compute metrics on our 4 gender slices (`female`, `male`, `transgender`, and `other_gender`).\n",
        "\n",
        "**NOTE:** *Depending on your configurations, this step will take 2–10 minutes to run. For this exercise, we recommend leaving* `compute_confidence_intervals` *disabled to decrease computation time.*\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Pkuti2_8cR2D",
        "colab_type": "code",
        "cellView": "both",
        "colab": {}
      },
      "source": [
        "tfma_eval_result_path = os.path.join(BASE_DIR, 'tfma_eval_result')\n",
        "\n",
        "# NOTE: If you want to explore slicing by other categories, you can\n",
        "# change the slice_selection value to \"sexual_orientation\",\n",
        "# \"religion\", \"race\", or \"disability\".\n",
        "slice_selection = 'gender' \n",
        "\n",
        "# Computing confidence intervals can help you make better decisions\n",
        "# regarding your data, but it requires computing multiple resamples,\n",
        "# so it takes significantly longer to run, particularly in Colab\n",
        "# (which cannot take advantage of parallelization). For that reason,\n",
        "# we leave it disabled here.\n",
        "compute_confidence_intervals = False\n",
        "\n",
        "# Define slices that you want the evaluation to run on.\n",
        "slice_spec = [\n",
        "    tfma.slicer.SingleSliceSpec(), # Overall slice\n",
        "    tfma.slicer.SingleSliceSpec(columns=[slice_selection]),\n",
        "]\n",
        "\n",
        "# Add the fairness metrics.\n",
        "add_metrics_callbacks = [\n",
        "  tfma.post_export_metrics.fairness_indicators(\n",
        "      thresholds=[0.1, 0.3, 0.5, 0.7, 0.9],\n",
        "      labels_key=LABEL\n",
        "      )\n",
        "]\n",
        "\n",
        "eval_shared_model = tfma.default_eval_shared_model(\n",
        "    eval_saved_model_path=tfma_export_dir,\n",
        "    add_metrics_callbacks=add_metrics_callbacks)\n",
        "\n",
        "# Run the fairness evaluation.\n",
        "with beam.Pipeline() as pipeline:\n",
        "  _ = (\n",
        "      pipeline\n",
        "      | 'ReadData' >> beam.io.ReadFromTFRecord(validate_tf_file)\n",
        "      | 'ExtractEvaluateAndWriteResults' >>\n",
        "       tfma.ExtractEvaluateAndWriteResults(\n",
        "                 eval_shared_model=eval_shared_model,\n",
        "                 slice_spec=slice_spec,\n",
        "                 compute_confidence_intervals=compute_confidence_intervals,\n",
        "                 output_path=tfma_eval_result_path)\n",
        "  )\n",
        "\n",
        "eval_result = tfma.load_eval_result(output_path=tfma_eval_result_path)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "BssBGpqmwBMd"
      },
      "source": [
        "Finally, render the Fairness Indicators widget with the exported evaluation results."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "HFlK9U40dyDk",
        "colab": {}
      },
      "source": [
        "widget_view.render_fairness_indicator(eval_result=eval_result, slicing_column=slice_selection)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wEMcXd8ZO_T4",
        "colab_type": "text"
      },
      "source": [
        "**NOTE:** The categories above are not mutually exclusive, as examples can be tagged with zero or more of these gender-identity terms. An example with gender values of both `transgender` and `female` will be represented in both the `gender:transgender` and `gender:female` slices."
      ]
    },
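    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "sliceOverlapDemo"
      },
      "source": [
        "To make the overlap concrete, here's a small sketch (with made-up rows, not actual Civil Comments data) showing how an example tagged with multiple gender terms is counted in every matching slice:\n",
        "\n",
        "```python\n",
        "# Toy data: example 2 is tagged with two gender terms, so it is\n",
        "# counted in both the 'transgender' and 'female' slices.\n",
        "examples = [\n",
        "    {'id': 1, 'gender': ['female']},\n",
        "    {'id': 2, 'gender': ['transgender', 'female']},\n",
        "    {'id': 3, 'gender': ['male']},\n",
        "]\n",
        "\n",
        "slice_counts = {}\n",
        "for ex in examples:\n",
        "  for g in ex['gender']:\n",
        "    slice_counts[g] = slice_counts.get(g, 0) + 1\n",
        "\n",
        "print(slice_counts)  # {'female': 2, 'transgender': 1, 'male': 1}\n",
        "```\n",
        "\n",
        "Because of this overlap, the per-slice counts can sum to more than the total number of examples."
      ]
    },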
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "R-pA9MSPOl1_"
      },
      "source": [
        "### Exercise\n",
        "In the previous [TensorFlow Data Validation exercise](#scrollTo=wZU1Djze6E-s), we determined that the relatively small proportion of examples that had associated `gender` values combined with the class-imbalanced nature of the dataset might result in some bias in the model's predictions.\n",
        "\n",
        "Now that we've trained the model, we can actually evaluate for gender-related bias. In particular, we can take a closer look at gender-group performance on the following two metrics related to misclassifications:\n",
        "\n",
        "* **False positive rate (FPR)**, which tells us the percentage of actual \"not toxic\" comments that were incorrectly classified as \"toxic\"\n",
        "* **False negative rate (FNR)**, which tells us the percentage of actual \"toxic\" comments that were incorrectly classified as \"not toxic\"\n",
        "\n",
        "Use the Fairness Indicators widget above to answer the following questions."
      ]
    },
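    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "fprFnrSketch"
      },
      "source": [
        "As a quick refresher, both metrics can be computed directly from confusion-matrix counts. Here's a minimal sketch using made-up counts (not the model's actual results):\n",
        "\n",
        "```python\n",
        "# Toy confusion-matrix counts for a 'toxic' (positive-class) classifier.\n",
        "tp, fp, tn, fn = 40, 15, 35, 10\n",
        "\n",
        "fpr = fp / (fp + tn)  # nontoxic comments incorrectly flagged as toxic\n",
        "fnr = fn / (fn + tp)  # toxic comments incorrectly passed as nontoxic\n",
        "\n",
        "print(fpr, fnr)  # 0.3 0.2\n",
        "```"
      ]
    },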
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "YPDRjcKZk3EL",
        "colab_type": "text"
      },
      "source": [
        "#### **1. What are the overall false positive rate (FPR) and false negative rate (FNR) for the model at a classification threshold of 0.5?**"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tGyACRd8oFwP",
        "colab_type": "text"
      },
      "source": [
        "#### Solution\n",
        "\n",
        "Click below for the solution."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "QzmJGSTYsEYx",
        "colab_type": "text"
      },
      "source": [
        "Select a threshold of 0.5 in the dropdown at the top of the widget. To view the overall FPR, enable the **post_export_metrics/false_positive_rate** checkbox in the left panel and locate the **Overall** value in the table below the bar graph. Similarly, to view the overall FNR, enable the **post_export_metrics/false_negative_rate** checkbox in the left panel and locate the **Overall** value in the table.\n",
        "\n",
        "Our results show that overall FPR\\@0.5 is 0.28, and overall FNR\\@0.5 is 0.27:\n",
        "\n",
        "![Side-by-side screenshot of FPR and FNR results in the Fairness Indicator widget. At left, Fairness Indicator results with the post_export_metrics/false_positive_rate metric enabled, and the Overall value of 0.28057 circled. At right, Fairness Indicator results with the post_export_metrics/false_negative_rate metric enabled, and the Overall value of 0.27339 circled](https://developers.google.com/machine-learning/practica/fairness-indicators/colab-images/fairness_indicators_colab1_exercise1.png)\n",
        "\n",
        "**NOTE:** *Model training is not deterministic, so your exact evaluation results here may vary slightly from ours.*\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "nu8LXMy8nR8N",
        "colab_type": "text"
      },
      "source": [
        "#### **2. What are the FPR\\@0.5 and FNR\\@0.5 for the following gender subgroups:** \n",
        "  * `male`\n",
        "  * `female`"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "FQGWSdrJy08B",
        "colab_type": "text"
      },
      "source": [
        "#### Solution\n",
        "\n",
        "Click below for the solution."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Ox0tkciby1qq",
        "colab_type": "text"
      },
      "source": [
        "Select a threshold of 0.5 in the dropdown at the top of the widget. To view FPR results for gender subgroups, enable the **post_export_metrics/false_positive_rate** checkbox in the left panel. To view FNR results for gender subgroups, enable the **post_export_metrics/false_negative_rate** checkbox in the left panel.\n",
        "\n",
        "**FPR\\@0.5**\n",
        "* `male`: 0.51\n",
        "* `female`: 0.47\n",
        "\n",
        "**FNR\\@0.5**\n",
        "* `male`: 0.13\n",
        "* `female`: 0.15\n",
        "\n",
        "**NOTE:** *Model training is not deterministic, so your exact evaluation results here may vary slightly from ours.*"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "pvIhbl_vnsaE",
        "colab_type": "text"
      },
      "source": [
        "#### **3. What fairness considerations can you identify by comparing aggregate FPR and FNR from #1 above to subgroup FPR and FNR from #2 above?**"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "LlkfgynX0yfF",
        "colab_type": "text"
      },
      "source": [
        "#### Solution\n",
        "\n",
        "Click below for a solution."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bP9PEk36Br_g",
        "colab_type": "text"
      },
      "source": [
        "The **Diff w/ baseline** column in the Fairness Indicators widget tells us the percent difference between a given subgroup's metric performance and the aggregate (overall) metric performance.\n",
        "\n",
        "False negative rate is lower for both male and female subgroups (–51% and –45%, respectively) than it is overall. In other words, the model is less likely to misclassify a male- or female-related toxic comment as \"nontoxic\" than it is to misclassify toxic comments as \"nontoxic\" overall.\n",
        "\n",
        "![Fairness Indicators results at a Threshold of 0.5 with the post_export_metrics/false_negative_rate metric selected. The \"Diff w. baseline\" results for male and female genders are highlighted: –51.43998% for male and –45.38494% for female.](https://developers.google.com/machine-learning/practica/fairness-indicators/colab-images/fairness_indicators_colab1_exercise3_fnr.png)\n",
        "\n",
        "In contrast, the false positive rate is higher for both male and female subgroups (+83% and +69%) than it is overall. In other words, the model is more likely to misclassify a male- or female-related nontoxic comment as \"toxic\" than it is to misclassify nontoxic comments as \"toxic\" overall.\n",
        "\n",
        "![Fairness Indicators results at a Threshold of 0.5 with the post_export_metrics/false_positive_rate metric selected. The \"Diff w. baseline\" results for male and female genders are highlighted: +82.83489% for male and +68.68737% for female.](https://developers.google.com/machine-learning/practica/fairness-indicators/colab-images/fairness_indicators_colab1_exercise3_fpr.png)\n",
        "\n",
        "**NOTE:** *Model training is not deterministic, so your exact evaluation results here may vary slightly from ours.*\n",
        "\n",
        "This higher FPR raises issues of fairness that should be remediated. If gender-related comments are more likely to be misclassified as \"toxic,\" then in practice, this could result in gender discourse being disproportionally suppressed."
      ]
    },
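    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "diffBaselineSketch"
      },
      "source": [
        "For reference, **Diff w/ baseline** is just the percent difference between a subgroup's metric value and the overall value. A minimal sketch, using the rounded FNR values from question #2 (the widget uses unrounded values, so its percentages differ slightly):\n",
        "\n",
        "```python\n",
        "def diff_w_baseline(subgroup_value, overall_value):\n",
        "  # Percent difference of a subgroup metric relative to the overall metric.\n",
        "  return (subgroup_value - overall_value) / overall_value * 100\n",
        "\n",
        "# Rounded FNR@0.5: male = 0.13, overall = 0.27\n",
        "print(round(diff_w_baseline(0.13, 0.27), 1))  # -51.9, close to the\n",
        "                                              # widget's -51.4%\n",
        "```"
      ]
    },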
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "0gEH2xBvRzYi"
      },
      "source": [
        "## Part IV: Dig Deeper into the Data\n",
        "\n",
        "In this section, you'll use the [What-If Tool's](https://pair-code.github.io/what-if-tool/) interactive visual interface to improve your understanding of how the toxic-text classifier classifies individual examples, from which you can extrapolate larger insights.\n",
        "\n",
        "#### **WARNING: When you launch the What-If tool widget below, the left panel will display the full text of individual comments from the Civil Comments dataset. Some of these comments include profanity, offensive statements, and offensive statements involving identity terms. Feel free to skip Part IV if this is a concern.**"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "rGQGxt8G-p3y"
      },
      "source": [
        "Launch What-If Tool with 1,000 training examples displayed:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "9nFGBYRj-erY",
        "colab": {}
      },
      "source": [
        "# Limit the number of examples to 1000, so that data loads successfully\n",
        "# for most browser/machine configurations. \n",
        "DEFAULT_MAX_EXAMPLES = 1000\n",
        "\n",
        "# Load up to 100,000 examples into memory.\n",
        "def wit_dataset(file, num_examples=100000):\n",
        "  dataset = tf.data.TFRecordDataset(\n",
        "      filenames=[file]).take(num_examples)\n",
        "  return [tf.train.Example.FromString(d.numpy()) for d in dataset]\n",
        "\n",
        "wit_data = wit_dataset(train_tf_file)\n",
        "\n",
        "# Configure WIT with 1000 examples, the FEATURE_MAP we defined above, and\n",
        "# a label of 1 for positive (toxic) examples and 0 for negative (nontoxic)\n",
        "# examples.\n",
        "config_builder = WitConfigBuilder(wit_data[:DEFAULT_MAX_EXAMPLES]).set_estimator_and_feature_spec(\n",
        "    classifier, FEATURE_MAP).set_label_vocab(['0', '1']).set_target_feature(LABEL)\n",
        "wit = WitWidget(config_builder)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "odeh6SsDUKYq"
      },
      "source": [
        "### Exercise\n",
        "\n",
        "Use the What-If tool to complete the following tasks and answer the associated questions."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "N8klOQTWMbO5",
        "colab_type": "text"
      },
      "source": [
        "#### Task 1\n",
        "\n",
        "Using the **Binning**, **Color By**, **Label by**, and **Scatter** dropdowns at the top of the What-If widget, create a visualization that groups examples by gender, and displays both how each example was categorized (`Inference label`) by the model and whether the classification was correct (`Inference correct`)."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "FBhBsevUOinO"
      },
      "source": [
        "#### Solution\n",
        "\n",
        "Click below for one possible solution."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "aKbl_6FNOnf1"
      },
      "source": [
        "Here is one possible configuration that groups examples by gender, visualizing both the inference label and whether inference was correct or incorrect for each example:\n",
        "\n",
        "**NOTE:** *Model training is not deterministic, so your exact results in each category may vary slightly from ours.*\n",
        "\n",
        "![Visualization of the What-If tool widget. Description of the visualization follows.](https://developers.google.com/machine-learning/practica/fairness-indicators/colab-images/wit_colab1_exercise1.png)\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "MaQx5uWJCN3M"
      },
      "source": [
        "In the above visualization, we first set **Binning | Y-Axis** to `gender` to bucket examples by gender on the vertical axis. (We've left both **Scatter** dropdowns at their defaults to clump all the data points together, but you could also scatter by `Inference correct` or `Inference label` to split apart different classifications within each gender group.)\n",
        "\n",
        "Next, we set **Color By** to `Inference correct` to color-code each example by whether the inference correctly predicted the ground-truth label. Correct predictions are colored blue, and incorrect predictions are colored red.\n",
        "\n",
        "Finally, we set **Label By** to `Inference label` to add a text label to each example that indicates how the model classified the example. Examples that the model classified positive (toxic) are labeled `1`, and examples that the model classified negative (non-toxic) are labeled `0`."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "cwTzull2U16-"
      },
      "source": [
        "#### Task 2\n",
        "\n",
        "Use the visualization you created in **Task 1** to locate the false positives (examples where the ground-truth label is \"nontoxic\" but the model predicted \"toxic\") in the `female` bucket. How many false positives are there?"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "P5MBQR7EF6ny"
      },
      "source": [
        "#### Solution\n",
        "\n",
        "Click below for the solution."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "IXuM7rqXcjTD"
      },
      "source": [
        "False positives are the red examples labeled `1` (outlined in yellow below).\n",
        "In our visualization, there are 5 false positives in the `female` bucket.\n",
        "\n",
        "![Zoomed-in view of the \"female\" section of the gender visualization from Task 1, with 5 false positives outlined in yellow.](https://developers.google.com/machine-learning/practica/fairness-indicators/colab-images/wit_colab1_exercise2.png)\n",
        "\n",
        "**NOTE:** *Model training is not deterministic, so your false-positive count may vary slightly from ours.*"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "19OAoAraVTx6"
      },
      "source": [
        "#### Task 3\n",
        "\n",
        "Can you determine what aspects of the comment text might have influenced the model to incorrectly predict the positive class for the examples you found in **Task 2**? \n",
        "\n",
        "Click on one of the false positives you found, and make some edits to the text in the `comment_text` field in the left panel. Then click **Run inference** below to see what label the model predicts for the revised text. What changes in the text will result in the model predicting a lower toxicity score?"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "OaL3qgHCcmwG"
      },
      "source": [
        "#### Solution\n",
        "\n",
        "Click below for one possible avenue to pursue."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "R2bZ5_rPcrGW"
      },
      "source": [
        "Try removing gender identity terms from the comments (e.g., `women`, `girl`), and see if that results in lower toxicity scores."
      ]
    }
  ]
}